ABSTRACT


Authentication is the process by which the identity of an individual is verified. 
Voice authentication is the verification of identity based on the analysis of an 
individual's voice. Voice authentication has various advantages, but it is 
seldom implemented due to its shortcomings compared with other forms of 
biometric authentication. In this paper we discuss an approach to 
implementing a voice authentication system in combination with a one-time 
password (OTP), to increase its real-world applicability and reduce its 
shortcomings. 
I. INTRODUCTION

At present the most popular type of authentication is 
password authentication, but with the increasing number of 
attacks by hackers it has become essential to set a strong 
password. The strength of a password grows with its 
complexity, but as complexity increases, such passwords 
become very difficult to remember. Alternative forms of 
authentication that use biometrics are therefore very useful, 
as the user does not have to remember anything to 
authenticate his or her identity. 

Biometric authentication methods such as fingerprint and iris 
recognition require very specialized equipment and are 
therefore costly and difficult to implement, whereas voice 
authentication requires a minimal amount of equipment and 
is also very easy to implement. However, voice authentication 
can be broken with the help of voice recordings, which can be 
obtained fairly easily. The purpose of this paper is to 
overcome this shortcoming of voice authentication by 
combining it with an OTP, so that it is resistant to attacks 
with a recorded voice. 

II. EXISTING SYSTEM 

In voice authentication the first step is to input the audio 
data into the system and convert it into a format in which it 
can be processed. In this method we take the sound wave, 
which is one-dimensional in nature, and turn it into 
numbers by recording the height of the wave at equally 
spaced points. This process is known as sampling. The 
sampled data is stored in the system. 
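
The sampling step can be illustrated with a minimal sketch. The function name sample_wave, the 440 Hz test tone, and the 8 kHz sampling rate are assumptions chosen for illustration; a real system would read the samples from a microphone rather than from a formula.

```python
import math

def sample_wave(signal, duration_s, sample_rate_hz):
    """Record the height of `signal` at equally spaced points in time."""
    n_samples = round(duration_s * sample_rate_hz)
    return [signal(i / sample_rate_hz) for i in range(n_samples)]

# Hypothetical input: a pure 440 Hz tone standing in for a voice signal.
def tone(t):
    return math.sin(2 * math.pi * 440 * t)

# 10 ms of audio at 8000 samples per second gives 80 numbers.
samples = sample_wave(tone, duration_s=0.01, sample_rate_hz=8000)
print(len(samples))
```

Each entry of samples is the height of the wave at one instant, so the list is the one-dimensional numeric form of the sound that the later steps operate on.
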

In the next step, the stored audio data is pre-processed so 
that it is easier for the neural network to recognize patterns 
in it. This can be done in a variety of ways depending on the 
neural network used. The audio data can be represented in 
various formats; for example, it can be represented in the 
form of a spectrogram, which is a visual representation of 
how the frequency content of the audio changes over time. 
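
As an illustration of the spectrogram representation, the following is a minimal sketch using only the Python standard library. The frame size, hop length, Hann window, and the 1 kHz test tone are assumptions made for the example; a practical system would more likely use an FFT library.

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Return one row of magnitudes per frame (rows = time, columns = frequency bins)."""
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window reduces spectral leakage at the frame edges.
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
                    for n, x in enumerate(frame)]
        # Naive DFT; for a real signal only the first half of the bins is needed.
        row = []
        for k in range(frame_size // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(windowed))
            row.append(abs(acc))
        rows.append(row)
    return rows

# Hypothetical input: a 1 kHz tone sampled at 8 kHz.
rate = 8000
signal = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(256)]
spec = spectrogram(signal)

# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin * rate / 64)
```

Plotting the rows as an image, with time on one axis and frequency on the other, gives the usual picture of a spectrogram.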



In the next step, we extract features from this pre-processed 
audio data so that the neural network can analyze them and 
recognize patterns in them. These patterns are used to 
match and compare different audio samples. 
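
A minimal sketch of feature extraction and comparison is given below. The feature used here (the share of spectral energy in four equal-width frequency bands) and the cosine similarity measure are assumptions chosen to keep the example small; they stand in for whatever features and matching the neural network actually uses, which the text does not specify.

```python
import cmath
import math

def band_energies(samples, n_bands=4):
    """Hypothetical feature vector: share of spectral energy in each frequency band."""
    n = len(samples)
    half = n // 2
    power = []
    for k in range(half):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        power.append(abs(acc) ** 2)
    width = half // n_bands
    bands = [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]
    total = sum(bands) or 1.0
    return [e / total for e in bands]

def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def tone(freq_hz, rate=8000, n=128):
    """Test signal standing in for recorded audio."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

# Two low-pitched signals should look alike; a high-pitched one should not.
low_a, low_b, high = tone(500), tone(625), tone(3000)
same = cosine_similarity(band_energies(low_a), band_energies(low_b))
diff = cosine_similarity(band_energies(low_a), band_energies(high))
print(same > diff)
```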

After the features are extracted, they are sent into the neural 
network. A number of samples of the user's voice are recorded 
and sent into the neural network model to train it to find 
the recurring patterns that are used to recognize the user's 
voice during voice authentication. 

Although this type of voice authentication has various 
advantages, its main drawback from the security point of 
view is that it cannot differentiate between a recorded voice 
played from an audio device and the actual user. The system 
only recognizes feature patterns in the audio, which are 
also present in audio played through audio devices. This is a 
serious problem for an authentication system, as the identity 
of the user can easily be spoofed with the help of a recording 
of the user's voice. 

III. SOLUTION 

Considering the drawback mentioned above, a possible 
solution is to implement an OTP along with voice 
authentication. In this method, during the registration 
process the user is first requested to recite the numbers from 
zero to nine, one by one, a number of times. The recordings 
are stored as ten different inputs, one per digit, and sent 
into the algorithm to train it so that it can recognize the 
voice pattern of the user. 

After the registration process is completed, the user can log 
in using his or her voice. In the login process, the user sends 
an authentication request. After receiving this request, the 
program generates a random four-digit code, which is 
displayed on the screen. This code, the OTP, is then recited 
by the user and taken as input into the system. Each digit 
recited by the user is fed into the previously trained 
algorithm one by one, and the pattern of the voice is matched 
against the pattern of the user's voice. If the voice pattern of 
every digit matches the voice pattern of the user, the next 
step is initiated, in which each digit is converted from speech 
to text. This text is then matched against the previously 
generated OTP. If the text matches the OTP, the 
authentication process is complete and the user is granted 
access. 
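
The login procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this paper: is_enrolled_voice and transcribe are hypothetical stand-ins for the trained voice model and the speech-to-text step, and an audio clip is modelled as a (speaker, digit) pair instead of real audio. The OTP is drawn with Python's secrets module.

```python
import secrets

def generate_otp(n_digits=4):
    """Return a random code of n_digits decimal digits."""
    return "".join(str(secrets.randbelow(10)) for _ in range(n_digits))

def authenticate(otp, clips, is_enrolled_voice, transcribe):
    """Grant access only if every clip passes the voice check and the text equals the OTP."""
    if len(clips) != len(otp):
        return False
    # Step 1: each recited digit must match the enrolled user's voice pattern.
    if not all(is_enrolled_voice(clip) for clip in clips):
        return False
    # Step 2: the recited digits, converted to text, must equal the displayed OTP.
    recited = "".join(transcribe(clip) for clip in clips)
    return secrets.compare_digest(recited, otp)

# Hypothetical stand-ins: a clip is a (speaker, digit) pair, not real audio.
def is_enrolled_voice(clip):
    return clip[0] == "alice"

def transcribe(clip):
    return clip[1]

otp = generate_otp()
# An older code that differs from the current OTP in every digit.
stale_code = "".join(str((int(d) + 1) % 10) for d in otp)

live = [("alice", d) for d in otp]            # the user reads the code shown
replay = [("alice", d) for d in stale_code]   # recording of an earlier login
impostor = [("mallory", d) for d in otp]      # another speaker reads the code

ok_live = authenticate(otp, live, is_enrolled_voice, transcribe)
ok_replay = authenticate(otp, replay, is_enrolled_voice, transcribe)
ok_impostor = authenticate(otp, impostor, is_enrolled_voice, transcribe)
print(ok_live, ok_replay, ok_impostor)  # True False False
```

The replayed recording passes the voice check but fails the text comparison, while the impostor fails the voice check even though the digits are correct.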



This implementation is effective against replayed recordings 
because the attacker cannot access the system with 
pre-recorded audio, as the voice passphrase changes every 
time. Because each digit of the passphrase is converted into 
text and compared with the generated OTP, a previously 
recorded audio clip of the user will have the same voice 
pattern, but once it has been converted to text it will not 
match the currently generated OTP, thereby preventing 
identity spoofing. 
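
As a rough check of this argument, the sketch below counts how many four-digit codes a single replayed recording would match. It assumes the OTP is drawn uniformly at random and that the attacker holds a recording of one earlier session; the recorded code 4821 is a hypothetical value.

```python
from itertools import product

recorded = "4821"  # hypothetical code captured in an earlier session

# Enumerate every possible four-digit OTP and count the ones the recording matches.
codes = ["".join(p) for p in product("0123456789", repeat=4)]
hits = sum(code == recorded for code in codes)
print(hits, len(codes), hits / len(codes))  # 1 10000 0.0001
```

Under these assumptions, a replayed session matches the new OTP in only one of 10,000 cases.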